quadratic term
QuadEnhancer: Leveraging Quadratic Transformations to Enhance Deep Neural Networks
Chen, Qian, Yang, Linxin, Wang, Akang, Luo, Xiaodong, Zhang, Yin
The combination of linear transformations and non-linear activation functions forms the foundation of most modern deep neural networks, enabling them to approximate highly complex functions. This paper explores the introduction of quadratic transformations to further increase nonlinearity in neural networks, with the aim of enhancing the performance of existing architectures. To keep parameter and computational complexity low, we propose a lightweight quadratic enhancer that uses low-rankness, weight sharing, and sparsification techniques. For a fixed architecture, the proposed approach introduces quadratic interactions between features at every layer while adding only a negligible number of additional parameters and forward computations. We conduct proof-of-concept experiments across three tasks: image classification, text classification, and fine-tuning large language models. In all tasks, the proposed approach yields clear and substantial performance gains.
- Europe > Switzerland > Zürich > Zürich (0.14)
- Asia > China > Guangdong Province > Shenzhen (0.05)
- Asia > China > Hong Kong (0.04)
- (2 more...)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)
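To make the idea concrete, here is a minimal PyTorch-style sketch of a low-rank quadratic interaction added on top of a layer's features; the module name, the residual placement, and the rank are illustrative assumptions, and the paper's weight-sharing and sparsification tricks are omitted.

```python
import torch
import torch.nn as nn

class LowRankQuadEnhancer(nn.Module):
    """Adds a low-rank quadratic interaction to a layer's features.

    Illustrative sketch only: the abstract additionally uses weight sharing
    and sparsification, which are omitted here for brevity.
    """
    def __init__(self, dim: int, rank: int = 4):
        super().__init__()
        # Two thin factors keep the quadratic term at O(dim * rank) parameters
        # instead of the O(dim^2) a dense bilinear form would require.
        self.U = nn.Linear(dim, rank, bias=False)
        self.V = nn.Linear(dim, rank, bias=False)
        self.out = nn.Linear(rank, dim, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Elementwise product of two low-rank projections gives a rank-r
        # quadratic interaction between input features.
        quad = self.U(x) * self.V(x)
        return x + self.out(quad)  # residual: original features plus quadratic term
```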
Bregman Alternating Direction Method of Multipliers
The mirror descent algorithm (MDA) generalizes gradient descent by using a Bregman divergence to replace squared Euclidean distance. In this paper, we similarly generalize the alternating direction method of multipliers (ADMM) to Bregman ADMM (BADMM), which allows the choice of different Bregman divergences to exploit the structure of problems. BADMM provides a unified framework for ADMM and its variants, including generalized ADMM, inexact ADMM and Bethe ADMM. We establish the global convergence and the O(1/T) iteration complexity for BADMM. In some cases, BADMM can be faster than ADMM by a factor of O(n / ln n), where n is the dimensionality. In solving the linear program of the mass transportation problem, BADMM leads to massive parallelism and can easily run on GPUs. BADMM is several times faster than the highly optimized commercial solver Gurobi.
- North America > United States > Minnesota (0.04)
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
- Europe > Netherlands > North Holland > Amsterdam (0.04)
- Asia > Middle East > Jordan (0.04)
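To see what replacing the squared Euclidean distance with a Bregman divergence buys, here is a small NumPy sketch of entropic mirror descent on the probability simplex, the textbook special case where the negative-entropy mirror map turns the additive gradient step into a multiplicative one. The toy objective and step size are illustrative assumptions; BADMM applies the same kind of substitution inside ADMM's augmentation term.

```python
import numpy as np

def entropic_mirror_descent(grad, x0, eta=0.1, steps=100):
    """Mirror descent on the simplex with the negative-entropy mirror map.

    Replacing gradient descent's squared Euclidean distance with the KL
    (Bregman) divergence yields a multiplicative-weights update that keeps
    the iterate on the probability simplex without an explicit projection.
    """
    x = np.asarray(x0, dtype=float)
    for _ in range(steps):
        x = x * np.exp(-eta * grad(x))   # exponentiated-gradient step
        x /= x.sum()                     # renormalize onto the simplex
    return x

# Toy objective: minimize <c, x> over the simplex (optimum puts all mass on argmin c).
c = np.array([0.3, 0.1, 0.6])
x_star = entropic_mirror_descent(lambda x: c, np.ones(3) / 3)
print(x_star)  # mass concentrates on index 1
```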
A quantum annealing approach to graph node embedding
Node embedding is a key technique for representing graph nodes as vectors while preserving structural and relational properties, which enables machine learning tasks like feature extraction, clustering, and classification. While classical methods such as DeepWalk, node2vec, and graph convolutional networks learn node embeddings by capturing structural and relational patterns in graphs, they often require significant computational resources and struggle with scalability on large graphs. Quantum computing provides a promising alternative for graph-based learning by leveraging quantum effects and introducing novel optimization approaches. Variational quantum circuits and quantum kernel methods have been explored for embedding tasks, but their scalability remains limited due to the constraints of noisy intermediate-scale quantum (NISQ) hardware. In this paper, we investigate quantum annealing (QA) as an alternative approach that mitigates key challenges associated with quantum gate-based models. We propose several formulations of the node embedding problem as a quadratic unconstrained binary optimization (QUBO) instance, making it compatible with current quantum annealers such as those developed by D-Wave. We implement our algorithms on a D-Wave quantum annealer and evaluate their performance on graphs with up to 100 nodes and embedding dimensions of up to 5. Our findings indicate that QA is a viable approach for graph-based learning, providing a scalable and efficient alternative to previous quantum embedding techniques.
- North America > United States > New Mexico > Los Alamos County > Los Alamos (0.04)
- North America > United States > Massachusetts > Middlesex County > Belmont (0.04)
- North America > Canada > British Columbia > Metro Vancouver Regional District > Burnaby (0.04)
- Europe > Bulgaria > Sofia City Province > Sofia (0.04)
- Energy (0.93)
- Government > Regional Government (0.46)
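As a toy illustration of what a QUBO encoding of an embedding-style problem looks like, the sketch below assigns each node a one-hot binary code so that adjacent nodes prefer to share a code. The specific objective, penalty weight, and brute-force solve are illustrative assumptions rather than one of the paper's formulations; an annealer such as D-Wave's would sample the same matrix instead of enumerating assignments.

```python
import itertools
import numpy as np

# Toy graph: two triangles {0,1,2} and {3,4,5} joined by the bridge edge (2, 3).
n, d = 6, 2
edges = {(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)}
lam = 2.0  # strength of the one-hot ("pick exactly one code") penalty

def var(i, k):
    # Flatten (node i, code k) into a single binary-variable index.
    return i * d + k

# Build the QUBO matrix Q so that the energy of a binary assignment x is x^T Q x.
Q = np.zeros((n * d, n * d))
for i, j in itertools.combinations(range(n), 2):
    sign = -1.0 if (i, j) in edges else +1.0  # edges reward a shared code, non-edges penalize it
    for k in range(d):
        Q[var(i, k), var(j, k)] += sign
for i in range(n):
    # One-hot penalty lam * (sum_k x_ik - 1)^2, expanded into QUBO coefficients.
    for k in range(d):
        Q[var(i, k), var(i, k)] += -lam
    for k, l in itertools.combinations(range(d), 2):
        Q[var(i, k), var(i, l)] += 2 * lam

# Brute-force the 2^(n*d) assignments; a quantum annealer would sample Q instead.
best = min(itertools.product([0, 1], repeat=n * d),
           key=lambda x: np.array(x) @ Q @ np.array(x))
print(np.array(best).reshape(n, d))  # each triangle ends up with its own one-hot code
```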
Nearly Optimal Differentially Private ReLU Regression
Ding, Meng, Lei, Mingxi, Wang, Shaowei, Zheng, Tianhang, Wang, Di, Xu, Jinhui
In this paper, we investigate one of the most fundamental nonconvex learning problems, ReLU regression, in the Differential Privacy (DP) model. Previous studies on private ReLU regression heavily rely on stringent assumptions, such as constant bounded norms for feature vectors and labels. We relax these assumptions to a more standard setting, where data can be i.i.d. sampled from $O(1)$-sub-Gaussian distributions. We first show that when $\varepsilon = \tilde{O}(\sqrt{\frac{1}{N}})$ and there is some public data, it is possible to achieve an upper bound of $\tilde{O}(\frac{d^2}{N^2 \varepsilon^2})$ for the excess population risk in $(\varepsilon, \delta)$-DP, where $d$ is the dimension and $N$ is the number of data samples. Moreover, we relax the requirement on $\varepsilon$ and public data by proposing and analyzing a one-pass mini-batch Generalized Linear Model Perceptron algorithm (DP-MBGLMtron). Additionally, using the tracing-attack argument, we demonstrate that the minimax rate of the estimation error for $(\varepsilon, \delta)$-DP algorithms is lower bounded by $\Omega(\frac{d^2}{N^2 \varepsilon^2})$. This shows that DP-MBGLMtron achieves the optimal utility bound up to logarithmic factors. Experiments further support our theoretical results.
- North America > United States > California (0.04)
- North America > United States > New York > New York County > New York City (0.04)
- Europe > Belgium > Flanders > East Flanders > Ghent (0.04)
- Asia > China > Guangdong Province > Guangzhou (0.04)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (0.54)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.45)
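A simplified NumPy sketch of a one-pass, mini-batch GLMtron-style update for ReLU regression with per-example clipping and Gaussian noise is shown below. The learning rate, clipping threshold, and noise scale are placeholder assumptions and are not the calibrated values that the paper's $(\varepsilon, \delta)$-DP analysis would require.

```python
import numpy as np

def dp_minibatch_glmtron(X, y, batch_size=64, lr=0.1, clip=1.0, noise_std=1.0, seed=0):
    """One pass of a mini-batch GLMtron-style update for ReLU regression,
    perturbed with Gaussian noise. Illustrative only: a real DP guarantee
    requires a noise scale calibrated to the clipping norm, batch size,
    and target (epsilon, delta).
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    for start in range(0, n, batch_size):          # single pass over the data
        Xb, yb = X[start:start + batch_size], y[start:start + batch_size]
        residual = np.maximum(Xb @ w, 0.0) - yb     # ReLU prediction error
        grads = residual[:, None] * Xb              # per-example GLMtron gradients
        norms = np.maximum(np.linalg.norm(grads, axis=1) / clip, 1.0)
        grads = grads / norms[:, None]              # clip per-example sensitivity
        noisy_grad = grads.mean(axis=0) + rng.normal(0, noise_std * clip / len(Xb), d)
        w -= lr * noisy_grad
    return w
```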
Export Reviews, Discussions, Author Feedback and Meta-Reviews
Overview: This paper studies the benefits of augmenting the linear programming relaxation of the maximum a posteriori (MAP) inference problem in graphical models with a quadratic term, thereby achieving strong convexity. Such augmented formulations are obtained from both the original primal and dual formulations, and in each case the resulting primal-dual relationship is studied. Prior work has mostly focused on smoothing the LP formulation with a softmax/entropy term, with a few notable exceptions, such as [5], [17], and [18]. Unlike those previous approaches, which employ a quadratic term in the sub-problems of either a *proximal* or an *alternating direction* scheme, the present manuscript adds the quadratic smoothing term directly. This can, in some ways, be seen as a naive approach: in contrast to proximal or alternating direction schemes, convergence to the global optimum of the original problem is no longer guaranteed, and the approximation quality depends directly on the strength of the augmentation term.
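One generic way to write this kind of quadratic augmentation (a standard smoothing construction, not necessarily the exact formulation of the paper under review): the LP relaxation of MAP inference over the local marginal polytope $\mathcal{L}$ is perturbed by a quadratic term of strength $\gamma > 0$,
$$ \max_{\mu \in \mathcal{L}} \langle \theta, \mu \rangle \quad\longrightarrow\quad \max_{\mu \in \mathcal{L}} \langle \theta, \mu \rangle - \frac{\gamma}{2}\,\|\mu\|_2^2 , $$
which makes the objective strongly concave (and the corresponding dual smooth) but shifts the maximizer away from the LP optimum by an amount controlled by $\gamma$, the trade-off the review points to.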
QuadraNet V2: Efficient and Sustainable Training of High-Order Neural Networks with Quadratic Adaptation
Xu, Chenhui, Wang, Xinyao, Yu, Fuxun, Xiong, Jinjun, Chen, Xiang
Machine learning is evolving towards high-order models that necessitate pre-training on extensive datasets, a process associated with significant overheads. Traditional models, despite having pre-trained weights, are becoming obsolete due to architectural differences that obstruct the effective transfer and initialization of these weights. To address these challenges, we introduce a novel framework, QuadraNet V2, which leverages quadratic neural networks to create efficient and sustainable high-order learning models. Our method initializes the primary term of the quadratic neuron using a standard neural network, while the quadratic term is employed to adaptively enhance the learning of data non-linearity or shifts. This integration of pre-trained primary terms with quadratic terms, which possess advanced modeling capabilities, significantly augments the information characterization capacity of the high-order network. By utilizing existing pre-trained weights, QuadraNet V2 reduces the required GPU hours for training by 90% to 98.4% compared to training from scratch, demonstrating both efficiency and effectiveness.
- Europe > Switzerland > Zürich > Zürich (0.14)
- Asia > China > Liaoning Province > Dalian (0.04)
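A rough PyTorch-style sketch of the initialization idea: reuse a pretrained linear layer as the primary term and start a low-rank quadratic term at zero, so the layer initially reproduces the pretrained model and training only has to learn the non-linear correction. The module name, low-rank parameterization, and zero-initialization are illustrative assumptions, not the paper's exact construction.

```python
import torch
import torch.nn as nn

class QuadraticNeuronLayer(nn.Module):
    """Pretrained linear (primary) term plus a learnable rank-r quadratic term.

    Sketch only: because U starts at zero, the layer's output initially equals
    the pretrained linear layer's output, and the quadratic term is learned
    during fine-tuning.
    """
    def __init__(self, pretrained: nn.Linear, rank: int = 4):
        super().__init__()
        out_dim, in_dim = pretrained.weight.shape
        self.primary = pretrained                               # reuse pretrained weights
        self.U = nn.Parameter(torch.zeros(in_dim, rank))        # zero-init quadratic factor
        self.V = nn.Parameter(torch.randn(in_dim, rank) * 0.02)
        self.out = nn.Linear(rank, out_dim, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        quad = (x @ self.U) * (x @ self.V)       # rank-r quadratic feature interactions
        return self.primary(x) + self.out(quad)  # pretrained linear path + quadratic correction

# Hypothetical usage: wrap an existing pretrained layer, e.g.
# layer = QuadraticNeuronLayer(pretrained_model.fc, rank=4)
```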
Energy-Preserving Reduced Operator Inference for Efficient Design and Control
Koike, Tomoki, Qian, Elizabeth
Many-query computations, in which a computational model for an engineering system must be evaluated many times, are crucial in design and control. For systems governed by partial differential equations (PDEs), typical high-fidelity numerical models are high-dimensional and too computationally expensive for the many-query setting. Thus, efficient surrogate models are required to enable low-cost computations in design and control. This work presents a physics-preserving reduced model learning approach that targets PDEs whose quadratic operators preserve energy, such as those arising in the governing equations of many fluids problems. The approach is based on the Operator Inference method, which fits reduced model operators to state snapshot and time-derivative data in a least-squares sense. However, Operator Inference does not generally learn a reduced quadratic operator with the energy-preserving property of the original PDE. Thus, we propose a new energy-preserving Operator Inference (EP-OpInf) approach, which imposes this structure on the learned reduced model via constrained optimization. Numerical results using the viscous Burgers' equation and the Kuramoto-Sivashinsky equation (KSE) demonstrate that EP-OpInf learns efficient and accurate reduced models that retain this energy-preserving structure.
- North America > United States > New York (0.04)
- Oceania > New Zealand (0.04)
- North America > United States > Illinois (0.04)
- (2 more...)
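For reference, a bare-bones NumPy sketch of the unconstrained Operator Inference regression that EP-OpInf builds on: linear and quadratic reduced operators are fit to reduced states and their time derivatives by least squares. The energy-preserving constraint on the quadratic operator, which the paper imposes via constrained optimization, is omitted here; the variable names and Kronecker-style quadratic features are illustrative.

```python
import numpy as np

def operator_inference(Xr, Xr_dot):
    """Fit x_dot ≈ A x + H (x ⊗ x) to reduced snapshot data by least squares.

    Xr, Xr_dot: arrays of shape (r, K) holding K reduced states and their time
    derivatives. Returns the linear operator A (r x r) and quadratic operator
    H (r x r^2). EP-OpInf would additionally constrain H so the quadratic term
    conserves energy; that constraint is omitted in this sketch.
    """
    r, K = Xr.shape
    quad = np.einsum('ik,jk->ijk', Xr, Xr).reshape(r * r, K)  # columnwise x ⊗ x
    D = np.vstack([Xr, quad])                                 # data matrix, (r + r^2, K)
    # Solve min_O ||O D - Xr_dot||_F via the transposed least-squares problem.
    O, *_ = np.linalg.lstsq(D.T, Xr_dot.T, rcond=None)
    O = O.T                                                   # shape (r, r + r^2)
    return O[:, :r], O[:, r:]

# Tiny smoke test with synthetic, purely linear dynamics.
rng = np.random.default_rng(0)
Xr = rng.standard_normal((3, 200))
A_true = rng.standard_normal((3, 3))
A_hat, H_hat = operator_inference(Xr, A_true @ Xr)
print(np.allclose(A_hat, A_true, atol=1e-6))  # recovers A; H comes out ~0
```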